Comparison of chironomic stylization versus statistical modeling of prosody for expressive speech synthesis

نویسندگان

  • Marc Evrard
  • Samuel Delalez
  • Christophe d'Alessandro
  • Albert Rilliard
چکیده

Chironomic stylization is the process of real-time modification of intonation contours (f0 and tempo) using drawing/writing gestures with a stylus on a graphic tablet. The question addressed in this research is whether hand-made intonation stylization could improve or degrade expressivity and overall quality, compared to statistical modeling of prosody. A system for expressive TTS in French based on HMMwas designed. A neutral corpus and six expressive speech corpora were used (anger, fear, joy, sadness, sensuality, surprise). Five sentences were synthesized with the six types of expressivity through CMLLR adaptation. Using a chironomic system, three trained subjects were asked to modify synthetic sentences, aiming at improving their expressive quality. Natural, HMM-TTS, and HMMTTS-Chironomic sentences were evaluated in an expressivity recognition test and a MOS test. The results show that chironomic modification brings significant improvements in both recognition and MOS tests. These results are discussed in detail, together with the effects of voice quality on the perception of HMM-TTS expressive speech. The two main conclusions are: (i) intonation of HMM-TTS can be significantly improved; (ii) hand-corrected TTS improves expressivity and overall quality. Chironomic stylization is a powerful tool lying between fully automatic TTS and recorded speech.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

MeLos: Analysis and Modelling of Speech Prosody and Speaking Style

This thesis addresses the issue of modelling speech prosody for speech synthesis, and presents MeLos: a complete system for the analysis and modelling of speech prosody “the music of speech”. Research into the analysis and modelling of speech prosody has increased dramatically in recent decades, and speech prosody has emerged as a crucial concern for speech synthesis. The issue of speech prosod...

متن کامل

Prosody Annotation for Unit Selection Tts Synthesis

This paper concerns prosody annotation and intonation modeling, especially for the application in a corpus based speech synthesis. In order to establish the rules of the automatic intonation modeling, a four hour fully annotated speech database has been acoustically and perceptually analyzed. The speech material included different text types, dialogs and prosodically rich phrases. As the result...

متن کامل

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One of the interesting topics on multimedia domain is concerned with empowering computer in order to speech production. Speech synthesis is granting human abilities to the computer for speech production. Data-based approach and process-based approach are the two main approaches on speech synthesis. Each approach has its varied challenges. Unit-selection speech synthesis and statistical parametr...

متن کامل

Effects of Pitch Contours Stylization and Time Scale Modification on Natural Speech Synthesis

This paper describes the method of generation of intonated speech for natural speech synthesis using prosody generation model. The effect of pitch modification through pitch contour stylization for parameter extraction and time scale modification for it’s implementation has been mentioned. An approach for close-copy syllabic stylization has been described. In the latter part, algorithm for impl...

متن کامل

Modeling of Fundamental Frequency Contour of Thai Expressive Speech using Fujisaki’s Model and Structural Model

Problem statement: In spontaneous speech communication, prosody is an important factor that must be taken into account, since the prosody effects on not only the naturalness but also the intelligibility of speech. Focusing on synthesis of Thai expressive speech, a number of systems has been developed for years. However, the expressive speech with various speaking styles has not been accomplishe...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015